Identifying Collocations to Measure Compositionality: Shared Task System Description

Author

  • Ted Pedersen
Abstract

This paper describes three systems from the University of Minnesota, Duluth that participated in the DiSCo 2011 shared task, which evaluated distributional methods of measuring semantic compositionality. All three systems treated the task as a problem of collocation identification, on the assumption that strong collocates are minimally compositional. duluth-1 relies on the t-score, whereas duluth-2 and duluth-3 rely on Pointwise Mutual Information (PMI). duluth-1 was the top-ranked system overall in coarse-grained scoring, a 3-way category assignment in which pairs were assigned values of high, medium, or low compositionality.
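As a rough illustration of the two association measures named in the abstract, the Python sketch below computes a t-score and PMI for a word pair from corpus counts. It assumes the common contingency-table formulation: observed joint count O, expected count E under independence, t = (O - E) / sqrt(O), and PMI = log2(O / E). The function names, parameter names, and example counts are hypothetical and are not taken from the paper, whose exact formulation may differ.

```python
import math


def expected_count(n1p: int, np1: int, npp: int) -> float:
    """Expected joint count of a word pair if the two words were independent.

    n1p: count of pairs whose first word is w1
    np1: count of pairs whose second word is w2
    npp: total number of pairs in the corpus
    """
    return n1p * np1 / npp


def t_score(n11: int, n1p: int, np1: int, npp: int) -> float:
    """t-score: (observed - expected) / sqrt(observed)."""
    if n11 == 0:
        return 0.0  # assumption: treat unseen pairs as having no association
    return (n11 - expected_count(n1p, np1, npp)) / math.sqrt(n11)


def pmi(n11: int, n1p: int, np1: int, npp: int) -> float:
    """Pointwise Mutual Information: log2(observed / expected)."""
    if n11 == 0:
        return float("-inf")  # log of zero; the pair was never observed
    return math.log2(n11 / expected_count(n1p, np1, npp))


if __name__ == "__main__":
    # Hypothetical counts: the pair occurs 10 times, w1 occurs first in
    # 20 pairs, w2 occurs second in 50 pairs, out of 1000 pairs in total.
    print(t_score(10, 20, 50, 1000))
    print(pmi(10, 20, 50, 1000))
```

Under the assumption stated in the abstract, a pair with a higher association score would be read as a stronger collocation and therefore as less compositional. How scores are mapped onto the high, medium, and low categories is not specified here, so the sketch leaves that step out.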


Similar resources

Measuring the Compositionality of Collocations via Word Co-occurrence Vectors: Shared Task System Description

A description of a system for measuring the compositionality of collocations within the framework of the shared task of the Distributional Semantics and Compositionality workshop (DiSCo 2011) is presented. The system exploits the intuition that a highly compositional collocation would tend to have a considerable semantic overlap with its constituents (headword and modifier) whereas a collocatio...


Shared Task System Description: Frustratingly Hard Compositionality Prediction

We considered a wide range of features for the DiSCo 2011 shared task about compositionality prediction for word pairs, including COALS-based endocentricity scores, compositionality scores based on distributional clusters, statistics about wordnet-induced paraphrases, hyphenation, and the likelihood of long translation equivalents in other languages. Many of the features we considered correlate...


Shared Task System Description: Measuring the Compositionality of Bigrams using Statistical Methodologies

The measurement of relative compositionality of bigrams is crucial to identify Multi-word Expressions (MWEs) in Natural Language Processing (NLP) tasks. The article presents the experiments carried out as part of the participation in the shared task ‘Distributional Semantics and Compositionality (DiSCo)’ organized as part of the DiSCo workshop in ACL-HLT 2011. The experiments deal with various c...


Relative Compositionality of Multi-word Expressions: A Study of Verb-Noun (V-N) Collocations

Recognition of Multi-word Expressions (MWEs) and their relative compositionality are crucial to Natural Language Processing. Various statistical techniques have been proposed to recognize MWEs. In this paper, we integrate all the existing statistical features and investigate a range of classifiers for their suitability for recognizing the non-compositional Verb-Noun (V-N) collocations. In the t...


Distributional Semantics and Compositionality 2011: Shared Task Description and Results

This paper gives an overview of the shared task at the ACL-HLT 2011 DiSCo (Distributional Semantics and Compositionality) workshop. We describe in detail the motivation for the shared task, the acquisition of datasets, the evaluation methodology and the results of participating systems. The task of assigning a numerical score for a phrase according to its compositionality proved to be hard. Man...



Publication date: 2011